Summary

Current models of human visual search have 
extended the traditional serial/parallel 
search dichotomy. Two successful models 
for predicting human visual search are the 
Guided Search model and the Signal 
Detection Theory model. Although these 
models are inherently different, it has been 
difficult to compare them because the 
Guided Search model is designed to predict 
response time, while Signal Detection 
Theory models are designed to predict 
performance accuracy. Moreover, current 
implementations of the Guided Search 
model require the use of Monte-Carlo 
simulations, a method that makes fitting the 
model’s performance quantitatively to 
human data more computationally time 
consuming. We have extended the Guided 
Search model to predict human accuracy in 
target-localization search tasks. We have 
also developed analytic expressions that 
simplify simulation of the model to the 
evaluation of a small set of equations using 
only three free parameters. This new 
implementation and extension of the Guided 
Search model will enable direct quantitative 
comparisons with human performance in 
target-localization search experiments and 
with the predictions of Signal Detection 
Theory and other search accuracy models. 

Introduction 

In standard visual-search tasks, the observer 
looks for a target among a set of distractors. 
When the target differs greatly from the 
distractors along a single feature dimension 
(e.g. contrast, orientation, color, etc.), the 
time to find the target is relatively constant 
as a function of the number of elements (set 
size). This fact is commonly interpreted as 
evidence for parallel search (i.e. the 
simultaneous examination of potential 
targets). Alternatively, when the target is 
similar to the distractors, the distractors are 
heterogeneous, or the target can be 
differentiated from the distractors only by 
the combination of two or more feature 
dimensions (conjunctions), the response
time increases drastically with set size. This
finding is commonly interpreted as evidence 
for a temporally serial search (i.e. the 
sequential examination of potential targets). 
Although the parallel/serial dichotomy has 
dominated research for more than two 
decades (refs. 1-4), and is central to many 
theories of visual search (e.g., Feature 
Integration Theory; ref. 5), more recently, 
models of visual search have moved away 
from the original strict serial/parallel 
dichotomy. 

There have been two traditions in studying 
the processes mediating visual search. One 
approach has been to allow extended 
viewing of the stimulus and to measure 
observer response times (e.g., refs. 6-8), 
while a second approach has been to use 
fixed duration displays and to measure 
detection accuracy, the probability of 
correctly detecting the presence of the target 
(e.g., refs. 9, 10). The response-time results 
can be predicted by a two-stage Guided 
Search (GS) model (refs. 6, 11) in which an
initial parallel system guides a subsequent 
serial-search stage. The accuracy results 
can be predicted by a single-stage Signal 
Detection Theory (SDT) model (ref. 12) in
which processing is parallel but noisy (e.g. 
refs. 9, 10, and 13-16). Even though the GS 
and SDT models are based on 
fundamentally different assumptions about 
human visual information processing, 
progress in our understanding of search has 
been hampered by the difficulty associated 
with directly comparing these two models 
because they were developed for different 
experimental paradigms and are applicable 
to different empirical measurements. A 
second difficulty is that current 
implementations of the GS model (ref. 11)
require Monte-Carlo simulations, which 
make fitting the model to human data more 
computationally time consuming. 

This report describes an analytic extension 
of the GS model, the Guided Search 
Accuracy (GSA) model, which predicts 
performance accuracy in a target-localization
search task as a function of set
size. We develop analytic mathematical 
expressions that allow quantitative fitting of 
the model to human data in a time-efficient 
manner using only three free parameters. 

The significance of this implementation and 
extension is that it will allow the direct and 
quantitative comparison of Guided Search 
and Signal Detection Theory models in 
target-localization search tasks. 

Theory 

The Guided Search model 

The GS model (ref. 11) was developed to
predict response times as a function of set 
size in one type of visual search task. In 
this type of search experiment, there are two 
kinds of displays: target-present and target- 
absent. Each consists of N elements. In 
target-present trials, one element, the target, 
differs from the others, the distractors, while 
in target-absent trials, all the elements are 
distractors. The observer’s task is to search 
the display to determine whether it is a 
target-present trial or target-absent trial and, 
as quickly as possible, to make a response 
indicating the decision. The dependence of 
the response times on the set size is 
measured. 

The GS model assumes that each element is 
processed by broadly tuned channels that 
correspond to categorical features (e.g. 
“red”, “green”, “bright”, etc.). The bottom- 
up response of each element is determined 
by a weighted average of the difference in 
output between that element and its 
neighbors. The top-down response is a 
function of the match of the element to the 
designated target. The final internal 
response associated with a given element is 
a weighted sum of the top-down and 
bottom-up responses. This final response is 
perturbed by the addition of internal neural 
noise, assumed to be Gaussian, to yield a 
final "activation" for each element. Finally, 
visual attention serially searches through
those elements whose activations are above
a threshold (λ), according to a self-terminating
procedure. Unlike the standard
serial search model where visual attention 
proceeds randomly from one element to 
another (ref. 5), in the GS model, visual 
attention proceeds in an activity dependent 
order. It begins with the element that 
elicited the highest activation and continues 
in order of decreasing activation. The search 
terminates when either the target is found, 
or no elements remain with activation above 
the activation threshold. Rejected elements 
are not revisited. It is assumed that 
attending to each element requires a fixed 
amount of processing time, and thus that the 
response time is determined by the total 
number of elements searched, and also that 
if the target is attended, the observer always 
correctly identifies that trial as a target- 
present trial. If the search is terminated 
without processing the target (because the 
target did not exceed threshold), the 
observer guesses “target absent” 97 % of the 
time and “target present” 3 % of the time. 

The Guided Search Accuracy model 

We have developed an analytic extension of 
the GS model, the GSA model (Fig. 1), 
which predicts accuracy in a localization 
search task. In localization search tasks, a 
fixed-duration display containing a single 
target and a number of distractors is 
presented to the observer. The observer 
then reports which of the N locations 
contains the target. The accuracy of 
correctly identifying the target location is 
measured as a function of set size (N). The 
structure of the GSA model is nearly 
identical to the GS model, except that to 
predict localization accuracy for fixed 
stimulus durations, the serial attention stage 
of the GSA model is restricted to examining 
a fixed number of elements. If the target 
has not been found within the restricted 
presentation time, the model is forced to 
guess. 

The GSA model consists of two stages: a 
noisy parallel-processing stage and a noise-free
serial-attention stage. In the parallel-processing
stage, each element in the
display elicits a noisy response, its 
activation. We assume that each element’s 
activation can be described by a Gaussian 
probability distribution, and that the target 
and distractor distributions have equal 
variances (footnote 1). A target elicits, on average, a
larger activation than a distractor (Fig. 2). 
The target-distractor discriminability is d’, a 
measure of the distance between the target 
and distractor distributions. The results of 
this parallel processing stage are then sent 
to the noise-free serial attention stage, which
first orders the supra-threshold elements
(those with activations greater than the
threshold, λ) according to their activation.
Then, visual attention serially processes the 
supra-threshold elements, beginning with 
the element with the highest activation and 
continuing in decreasing order of activation 
(Fig. 1). Processing each element with 
visual attention requires a fixed amount of 
time and processing continues until all 
supra-threshold elements have been 
processed or the target is found, or until the 
display presentation ends. If visual 
attention processes the target, the model 
always correctly identifies the location of 
the target (even if its activation from the 
initial parallel processing stage was by 
chance lower than that of a distractor) (footnote 2). If
the target was not processed during the
display presentation, the GSA model is
forced to guess. If not all supra-threshold
elements have been processed, the
model chooses the remaining supra-threshold
element with the highest
activation. Otherwise (if all supra-threshold
elements have been processed), it chooses
randomly among all remaining sub-threshold
elements. Errors can occur either when
there are more supra-threshold
elements than can be processed serially
before the display is terminated, or
when the activation of the target is sub-threshold.

Footnote 1: The equal-variance assumption is included to reduce
the number of free parameters in the GSA model, but
can be relaxed by adding a parameter specifying the
ratio of the variance of the target distribution to that of
the distractor.

Footnote 2: In both the GS and GSA models, and in other
models with a serial attention mechanism, visual
attention is assumed to be a homunculus that can
determine without error the identity of an attended
element.

Simplification of the GSA model 

To facilitate fitting the analytic GSA model 
to human performance data, we have 
simplified the original model (ref. 11) by
reducing the number of free parameters. 
First, because in most target-localization 
search tasks, the observer is searching for an 
a priori known target and not an odd-man 
out, we assume that the responses of the 
parallel processing stage are entirely 
determined by the similarity of an element 
to the known target (top-down activation). 
Thus, we assume that the contribution of the 
bottom-up processing is negligible; either 
the output associated with an element does 
not depend on a weighted average of the 
difference of filter outputs for that element 
and its neighbors (no lateral inhibition), or if 
it does, this effect does not vary across set-size
conditions (constant lateral inhibition) (footnote 3).
Therefore the GSA model is applicable to 
search displays that contain widely spaced 
elements or that maintain a constant inter- 
element distance for all set sizes. Second, 
we assume that the activation threshold and 
the maximum number of elements serially 
processed are constant across trials. All 
other aspects of the model are identical to 
those of the Guided Search model 2.0 (ref. 11).


Footnote 3: The GSA model could be extended to include effects
of lateral inhibition if an appropriate model of how 
target detectability varies as a function of element 
density is specified. Alternatively, lateral interactions 
between the activity elicited by each element can be 
avoided if the elements are far apart such that the 
distance between elements is large relative to their 
size. 

Analytic implementation of the GSA
model 

We have developed analytic expressions for 
the performance accuracy (% correct) of the 
GSA model. Because both the target and 
distractor distributions are equal variance 
Gaussians, without loss of generality, we 
can rescale these distributions so that: 1) 
both target and distractor distributions have 
unit variance, 2) the mean distractor 
activation is zero, 3) the mean target 
activation is d'. Thus, the probability of a 
distractor producing an activation x is: 

$$d(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-x^2/2\right) \tag{1}$$

The probability of the target producing an 
activation x is: 

$$t(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-(x-d')^2/2\right) \tag{2}$$

We denote the cumulative probability of a
distractor producing an activation less than
x by

$$D(<x) = \int_{-\infty}^{x} d(u)\,du, \tag{3}$$

the probability of a distractor producing an
activation greater than x by

$$D(>x) = \int_{x}^{+\infty} d(u)\,du, \tag{4}$$

the probability of the target producing an
activation less than x by

$$T(<x) = \int_{-\infty}^{x} t(u)\,du, \tag{5}$$

and the probability of the target producing
an activation greater than x by

$$T(>x) = \int_{x}^{+\infty} t(u)\,du. \tag{6}$$
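Because the distributions in Eqs. 1 and 2 are unit-variance Gaussians, the four cumulative probabilities in Eqs. 3-6 reduce to evaluations of the standard normal cumulative distribution. The following is a minimal Python sketch of that reduction; the function names (`norm_cdf`, `d_below`, `d_above`, `t_below`, `t_above`) are illustrative choices and do not come from the original report.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def d_below(x):
    """D(<x), Eq. 3: probability that a distractor activation is less than x."""
    return norm_cdf(x)


def d_above(x):
    """D(>x), Eq. 4: probability that a distractor activation is greater than x."""
    return 1.0 - norm_cdf(x)


def t_below(x, d_prime):
    """T(<x), Eq. 5: the target distribution is the distractor one shifted by d'."""
    return norm_cdf(x - d_prime)


def t_above(x, d_prime):
    """T(>x), Eq. 6: probability that the target activation is greater than x."""
    return 1.0 - norm_cdf(x - d_prime)
```

For example, `t_above(1.0, 1.0)` evaluates to 0.5, because a threshold placed at the target mean is exceeded on half of the trials.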


We compute the probability of the GSA 
model correctly locating the target in two 
separate regimes: 1) when the target 
activation is supra-threshold (i.e. larger than
λ), 2) when the target activation is sub-threshold.
Because, in many experiments,
the display duration is fixed (usually brief: 
50-300 ms) and examining each element 
requires a fixed amount of processing time, 
there is time to serially examine at most k 
elements, which may be less than the total 
number of supra-threshold elements. Thus, 
we assume that on each trial, at most k 
elements are serially examined, and that k is
less than or equal to N (if k >= N, there is
time to search all the elements). If the
target is examined, the model always
correctly chooses it (footnote 2). Otherwise, the model
chooses the unexamined element with the
highest activation, or if none of the
remaining elements is supra-threshold, it
randomly guesses among the sub-threshold
elements.

In the first regime, when the target 
activation is supra-threshold, there are two 
ways the model can correctly identify the 
target: 1) if the target ranks among the k 
highest activations, it is guaranteed to be 
processed and thus is always correctly 
located, 2) if the target has the (k+1)th highest
activation, it is next in line at the end of the
trial, so the model correctly chooses it (footnote 4).
Otherwise, it incorrectly chooses the
distractor that is next in line. Thus, for
supra-threshold targets, the target is always
correctly chosen if it is among the k+1
highest activations, and is never correctly
chosen if it is not. To explicitly compute
the probabilities, it is necessary to consider
all possible permutations of distractor
labeling, which is performed in the factorial
terms of the subsequent equations.

Footnote 4: Instead of choosing the (k+1)th element, a variation of
the model could guess among all unprocessed
elements. This latter version of the model produces
slightly worse performance for a given set of
parameters than the version presented here. However,
the two models have virtually identical sensitivity to
variations in the parameters.

The probability of a correct response given
that the target is supra-threshold and is
among the k+1 highest activations (Pc1) is
the sum from j = 1 to k+1 of the probability
of the target being supra-threshold and
being the jth highest activation. Each term is
the product of the probability of the target
taking a value x, the probability of exactly j-1
distractors taking a value larger than x, the
probability of exactly N-j distractors taking
a value less than x, times a binomial
coefficient describing the number of
possible distractor permutations, integrated
over all supra-threshold values of x:

$$Pc_1 = \sum_{j=1}^{k+1} \binom{N-1}{j-1} \int_{\lambda}^{+\infty} t(x)\,[D(>x)]^{j-1}\,[D(<x)]^{N-j}\,dx \tag{7}$$

Therefore, percent correct localization for
the case in which the target activation is
supra-threshold can be calculated as the sum
of the probabilities of the target activation
ranking among the k+1 highest activations
(Pc1).

In the second regime, the target activation is
sub-threshold, and is sometimes correctly
guessed from all the unprocessed elements
(Pc2). If the number of supra-threshold
distractors, j, is greater than k, the model
processes k distractors and then chooses the
distractor next in line, and so is never
correct. If the number of supra-threshold
distractors, j, is less than or equal to k, the
model first processes all j supra-threshold
distractors and then randomly chooses one
of the N-j sub-threshold elements.

Therefore, it is correct only if this random
choice is the target, which occurs with a
probability of 1/(N-j). The probability of a
correct guess (Pc2) is then the sum over j
from 0 to k of 1/(N-j) times the product of
the probability of the target being sub-threshold,
the probability of exactly j
distractors being supra-threshold, and a
binomial coefficient describing all possible
distractor permutations:

$$Pc_2 = \sum_{j=0}^{k} \frac{1}{N-j} \binom{N-1}{j}\, T(<\lambda)\,[D(>\lambda)]^{j}\,[D(<\lambda)]^{N-1-j} \tag{8}$$

Because the two possibilities described by
Eqs. 7 and 8 are mutually exclusive, the total
percent correct in the localization of the
target for the GSA model is the sum of these
two probabilities:

$$Pc = Pc_1 + Pc_2 \tag{9}$$
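The analytic expressions for the two regimes and their sum can be evaluated numerically with a few lines of code. The sketch below is one possible Python implementation, not the authors' own: the function names, the trapezoid-rule integration, the finite upper integration bound, and the truncation of the sums at the number of available distractors are all assumptions made here for illustration. The threshold is passed as `lam` and must be finite.

```python
import math


def norm_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pc_supra(n, k, d_prime, lam, span=10.0, steps=4000):
    """First regime: target supra-threshold and among the k+1 highest activations."""
    # Assumption: the integrand is negligible beyond max(lam, d') + span.
    upper = max(lam, d_prime) + span
    h = (upper - lam) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lam + i * h
        above = 1.0 - norm_cdf(x)   # D(>x)
        below = norm_cdf(x)         # D(<x)
        rank_sum = 0.0
        # The target can rank at most n-th, so the sum stops at min(k+1, n).
        for j in range(1, min(k + 1, n) + 1):
            rank_sum += math.comb(n - 1, j - 1) * above ** (j - 1) * below ** (n - j)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * norm_pdf(x - d_prime) * rank_sum
    return total * h


def pc_sub(n, k, d_prime, lam):
    """Second regime: target sub-threshold and correctly guessed."""
    p_above = 1.0 - norm_cdf(lam)       # D(>lam)
    p_below = norm_cdf(lam)             # D(<lam)
    t_below = norm_cdf(lam - d_prime)   # T(<lam)
    total = 0.0
    # At most n-1 distractors can be supra-threshold.
    for j in range(0, min(k, n - 1) + 1):
        total += math.comb(n - 1, j) * p_above ** j * p_below ** (n - 1 - j) / (n - j)
    return t_below * total


def pc_total(n, k, d_prime, lam):
    """Total probability of correct localization (sum of the two regimes)."""
    return pc_supra(n, k, d_prime, lam) + pc_sub(n, k, d_prime, lam)
```

A call such as `pc_total(8, 4, 1.0, 0.0)` returns the predicted proportion correct for a set size of 8, k = 4, d' = 1 and a threshold of 0; looping over set sizes gives one accuracy curve per parameter setting.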


Results

We investigated the model’s performance
(Pc) as a function of its three free
parameters: the activation threshold (λ), the
maximum number of elements that can be
processed serially within the presentation
time (k), and the discriminability between
the target and distractors (d’).

Effect of the activation threshold

The activation threshold is a primary cause
of performance errors for the GSA model.
Figure 3 shows the probability distributions
of the number of supra-threshold distractors
for a display with 6 distractors (N=7) for
four different values of the activation
threshold (λ = -10, 0, 1, 2). The
probabilities of the target (d’=1) exceeding
the threshold are 0.98, 0.84, 0.5 and 0.17,
respectively, for these activation thresholds.
When there is no activation threshold (or it
is very negative) then the activations of all 6
distractors are always supra-threshold. As
the threshold is increased, the expected
number of distractors exceeding threshold 
decreases and for high thresholds the shape 
of the distribution also changes (Fig. 3). 
Similar decreases in the probability of the 
target’s activation exceeding threshold also 
occur. Thus, an increase in the activation threshold
will generally decrease performance
accuracy for two reasons. First, it decreases 
the probability that the target is supra- 
threshold and thus that it is examined by the 
serial processor, which forces the model to 
guess more frequently. Second, it decreases 
the number of distractors that are examined 
by the serial processor and discarded from 
the guessing subset. 
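If the noise is independent across elements, the number of supra-threshold distractors follows a binomial distribution whose success probability is the chance that a single distractor exceeds the threshold. The sketch below computes that distribution and the probability that the target exceeds the threshold; the function names are illustrative, and the independence of the noise across elements is an assumption stated here rather than taken from the text.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def supra_distractor_pmf(n_distractors, lam):
    """Return P(exactly j distractors exceed lam) for j = 0..n_distractors."""
    p = 1.0 - norm_cdf(lam)  # chance that one distractor is supra-threshold
    return [
        math.comb(n_distractors, j) * p ** j * (1.0 - p) ** (n_distractors - j)
        for j in range(n_distractors + 1)
    ]


def target_supra_prob(d_prime, lam):
    """Probability that the target activation (mean d', unit variance) exceeds lam."""
    return 1.0 - norm_cdf(lam - d_prime)
```

Calling `supra_distractor_pmf(6, lam)` for several thresholds gives one distribution per threshold for a display with 6 distractors, and `target_supra_prob(1.0, 1.0)` returns 0.5.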

For those set-size conditions in which the
number of elements in the display is small
so that all (or all but one) of the elements
can be processed serially (N <= k+1), the
activation threshold is the only source of
performance errors. If there were no
activation threshold, for N <= k the model
would serially process every element in the
display and would always correctly identify
the target (Pc = 100%). Similarly, if N = k+1,
the model would serially examine N-1
elements, then correctly guess the remaining
element (if it hadn't already found the
target), again producing perfect
performance. Figure 4 shows the decrease
in performance caused by increasing the
activation threshold for a fixed k = 4 and d’
= 1.0. As the activation threshold is
increased, there is an increasing probability
that the target activation is sub-threshold,
and that the model incorrectly chooses a
distractor. If the target is sub-threshold, the
model first examines the supra-threshold
distractors. If the number of supra-threshold
distractors is greater than k, then
the model incorrectly chooses the (k+1)th
distractor. If the number of
supra-threshold distractors is less than or
equal to k, then the model is forced to guess
randomly among the remaining elements. For high
thresholds, the guessing is less accurate
because there are a larger number of sub-threshold
distractors. Therefore, a high
threshold lowers performance, because it
causes the model to guess more frequently
and less accurately.
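For these small set sizes (N <= k+1), the accuracy can be written in closed form, which makes the role of the threshold explicit. The derivation below is a sketch under the model's stated assumptions; it writes λ for the activation threshold and uses the cumulative probabilities D and T defined earlier. A supra-threshold target is always found or correctly chosen, and a sub-threshold target is guessed among the N-j sub-threshold elements after all j supra-threshold distractors have been examined:

$$Pc = T(>\lambda) + T(<\lambda) \sum_{j=0}^{N-1} \binom{N-1}{j} \frac{[D(>\lambda)]^{j}\,[D(<\lambda)]^{N-1-j}}{N-j}$$

Using the identity $\binom{N-1}{j}/(N-j) = \binom{N}{j}/N$ and the binomial theorem, the sum collapses to

$$Pc = T(>\lambda) + T(<\lambda)\,\frac{1 - [D(>\lambda)]^{N}}{N\,D(<\lambda)}$$

As a check, with λ = 0, d’ = 1 and N = 4 (so that k = 4 suffices), T(>λ) is about 0.841, D(>λ) = D(<λ) = 0.5, and Pc is about 0.841 + 0.159 × 0.469, or roughly 0.92. As λ becomes very negative, T(<λ) goes to zero and Pc approaches 1, in agreement with the perfect performance described above.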

Effect of the maximum number of 
elements serially processed (k) 

The time limit imposed on the serial
allocation of visual attention is the second
source of errors for the GSA model. If an
element is processed in τ ms and the
processing is temporally serial, then in a
presentation time of T ms, the model can only
process k = T/τ elements. Therefore, if
there are N > k+1 elements, the model will
be unable to process all of the elements
necessary to make a perfect decision. There
will be a non-zero probability that the target
is not processed by serial attention. In these
cases the model incorrectly chooses the
(k+1)th element. Alternatively, when there are
N <= k+1 elements, the serial processor will
be able to process all N-1 elements
necessary for a perfect decision unless the
target itself is sub-threshold (i.e. no errors
will be generated due to the serial
processing time). Figure 5 shows the
performance accuracy as a function of set
size for four different values of the
maximum number of serially processed
elements (k = 2, 4, 6, and 8). Increasing k
changes the set size at which performance
degrades due to the serial processing
(inflection point in the curve) and the rate at
which performance degrades (downward
trend of the curve). Figure 5A shows that
changes in k can greatly affect performance
for a low activation threshold (λ = 0).

Figure 5B shows that the effect of varying k is
much less dramatic for a larger activation
threshold (λ = 1). Thus, when λ is high,
most errors are generated by λ itself, and
changes in k do not affect performance as
much. This is because λ (together with N)
determines the average number of supra-threshold
elements. A high λ limits the
effective number of elements available to
the serial processor. If the effective (as
opposed to actual) number of elements is
large, then performance is limited largely by
k. If it is smaller than k, then performance
is mostly limited by λ.
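The trade-off between the two limits can be made concrete by comparing k with the expected number of supra-threshold elements. The sketch below is illustrative only: the function names are hypothetical, rounding the ratio of presentation time to per-element processing time down to an integer is an assumption made here, and the expected count simply adds the exceedance probabilities of the N-1 distractors and the target.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def max_examined(presentation_ms, per_element_ms):
    """k: number of elements that fit in the presentation time (rounded down, an assumption)."""
    return int(presentation_ms // per_element_ms)


def expected_supra(n, d_prime, lam):
    """Expected number of supra-threshold elements in a display of n elements."""
    p_distractor = 1.0 - norm_cdf(lam)        # one distractor exceeds lam
    p_target = 1.0 - norm_cdf(lam - d_prime)  # the target exceeds lam
    return (n - 1) * p_distractor + p_target
```

For N = 7 and d’ = 1, `expected_supra(7, 1.0, 0.0)` is about 3.8 while `expected_supra(7, 1.0, 1.0)` is about 1.5, so with k = 4 the higher threshold leaves fewer candidates than the serial stage could examine and the threshold, rather than k, becomes the limiting factor.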

Effect of target-distractor 
discriminability (d') 

In the GSA model, target-distractor 
discriminability is determined by the 
distance between the target and distractor 
distributions (Fig. 2). Decreasing the 
physical difference between the target and 
the distractors decreases the difference 
between the mean activation of the target 
(d’) and the mean activation of the 
distractor (always zero). As the target- 
distractor discriminability is reduced, the 
probability that the target does not rank 
among the k highest activations increases, 
thereby increasing errors. In addition, the 
probability that the target activation does 
not exceed threshold also increases thereby 
generating even more errors. As a result, 
decreasing target-distractor discriminability 
will reduce the GSA model performance. 
Figure 6 shows the model’s performance 
accuracy for four levels of target-distractor 
discriminability (d’ = 2.0, 1.5, 1.0, 0.5) as a 
function of set size for k=4. Figure 6A 
shows accuracy for λ = 0 and Figure 6B for
λ = 2. These results show that d' is an
important factor that influences
performance in two ways. First, d'
determines the overall level of performance:
lower d' values produce less accurate
performance for all conditions examined
(downward shift). Second, lower d’ values
increase the observed set-size effects. In
particular, for high thresholds (λ = 2), as
shown in Figure 6B, performance initially 
decreases rapidly as a function of set size, 
then decreases much more slowly. The set 
size for which this change is observed 
depends on the value of d'; high d’ values 
produce a rapid decrease in performance 
only for small set sizes, while low d' values 
produce rapid decreases in performance 
over a larger range of set sizes. 


Monte Carlo Simulations 

To verify that the analytic expressions 
above (Eqs. 1-9) accurately describe the 
GSA model, we compared the results from 
these expressions with predictions from 
standard Monte Carlo simulations of the 
sequence of probabilistic events described 
in Fig. 1. Performance predictions for the
GSA analytic expressions were in good 
agreement with results from the brute force 
Monte-Carlo simulations of the model 
across the range of parameter settings 
tested. 
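A brute-force simulation of this kind can be sketched as follows. This is an illustrative Python version written for this text, not the authors' simulation code: the function names, the trial count, and the fixed random seed are assumptions, and the decision rule follows the sequence of events described for the GSA model (examine up to k supra-threshold elements in decreasing order of activation, choose the next supra-threshold element if time runs out, otherwise guess among the sub-threshold elements).

```python
import random


def simulate_trial(n, k, d_prime, lam, rng):
    """Simulate one localization trial; return True if the target is chosen."""
    # Element 0 is the target; elements 1..n-1 are distractors.
    acts = [rng.gauss(d_prime, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(n - 1)]
    # Supra-threshold elements, ordered by decreasing activation.
    supra = sorted(
        (i for i in range(n) if acts[i] > lam),
        key=lambda i: acts[i],
        reverse=True,
    )
    # Serial stage: attention examines at most k supra-threshold elements.
    if 0 in supra[:k]:
        return True
    if len(supra) > k:
        # Out of time: choose the unexamined element with the highest activation.
        return supra[k] == 0
    # All supra-threshold elements were examined, so the target is sub-threshold:
    # guess at random among the sub-threshold elements.
    sub = [i for i in range(n) if acts[i] <= lam]
    return rng.choice(sub) == 0


def estimate_pc(n, k, d_prime, lam, trials=20000, seed=0):
    """Estimate the proportion of correct localizations over many simulated trials."""
    rng = random.Random(seed)
    hits = sum(simulate_trial(n, k, d_prime, lam, rng) for _ in range(trials))
    return hits / trials
```

Comparing `estimate_pc(n, k, d_prime, lam)` with the analytic prediction for the same parameters gives a direct check; the two should differ only by sampling error, which shrinks as the number of trials grows.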

Conclusions 

We have developed an extension of the 
Guided Search model, the GSA model, 
which uses explicit analytic expressions to 
compute accuracy in a target-localization 
task. Our implementation allows the 
performance accuracy of the Guided Search 
model to be directly compared to that of 
human observers, the SDT model, or any 
other model of target-localization accuracy. 
We explicitly investigated the effect of 
varying three model parameters (activation 
threshold, maximum number of elements 
serially processed within the presentation 
time, and the target-distractor 
discriminability) on localization accuracy as 
a function of set size. The GSA model will
facilitate the direct and quantitative 
comparison of the ability of the Guided 
Search and Signal Detection Theory models 
to explain human search performance in 
target-localization tasks. 

Figure 1. Schematic of the Guided Search Accuracy model. The parallel processing stage
generates a noisy response for each element (its activation). If this activation is above a
threshold (λ), it is passed to the serial processing stage. Attention sequentially examines
the k highest activations (only those that are supra-threshold) in descending order and
always correctly identifies the target if it is examined. If it runs out of time or supra-threshold
elements, it guesses. If it runs out of time before running out of supra-threshold
elements, it picks the unexamined element with the highest activation. If it runs out of
supra-threshold elements, it randomly picks one of the sub-threshold elements.


Figure 2. Probability distributions of the responses for the target and a distractor. The 
response distribution of a target whose d’ is 1 is illustrated. 



Figure 3. Probability distributions of the expected number of supra-threshold elements for 
different values of activation threshold for a fixed target-distractor discriminability (d’ = 
1.0) and set size (N= 7). 

Figure 4. Accuracy (Pc) as a function of set size for different values of activation
threshold (λ) for a fixed maximum number of serially processed elements (k = 4), and a
fixed target-distractor discriminability (d’ = 1.0).